Speeding Up Multi-class SVM Evaluation by PCA and Feature Selection
Abstract
Support Vector Machine (SVM) is a state-of-the-art learning machine that has proved fruitful not only in pattern recognition but also in data mining areas such as feature selection on microarray data, novelty detection, and the scalability of algorithms. SVM has been extensively and successfully applied to feature selection for genetic diagnosis. In this paper we do the reverse: we use the results achieved by applying SVM to feature selection to improve SVM itself. Removing redundant and non-discriminative features greatly reduces the computational cost of SVM and thus speeds up evaluation. We propose combining Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE) in multi-class SVM. We found that SVM is invariant under the PCA transform, which makes PCA a desirable dimension reduction method for SVM. RFE, on the other hand, is a suitable feature selection method for binary SVM. However, RFE requires many iterations, and each iteration trains the SVM once. This makes RFE infeasible for multi-class SVM without PCA dimension reduction, especially when the training set is large. Therefore, combining PCA with RFE is necessary. Our experiments on the benchmark database MNIST and other commonly used datasets show that PCA and RFE can speed up the evaluation of SVM by roughly a factor of 10 while maintaining comparable accuracy.
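One way to see why an SVM can be unaffected by a PCA transform is that a full-rank PCA (no components dropped) is only a centering followed by an orthogonal rotation, so pairwise distances between samples do not change, and neither does any kernel built from them. The sketch below checks this numerically for an RBF kernel. It is an illustration under assumptions, not the authors' derivation: the helper names `pca_full_rank` and `rbf_kernel`, the random data, and `gamma=0.5` are all made up for the example, and NumPy is assumed to be available.

```python
import numpy as np


def pca_full_rank(X):
    """Center X and rotate it onto all principal axes (nothing is dropped)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes; with full_matrices=True, Vt is a
    # square orthogonal matrix of size n_features x n_features.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=True)
    return Xc @ Vt.T


def rbf_kernel(X, gamma=0.5):
    """Kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


# Hypothetical data: 50 samples with 8 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
Z = pca_full_rank(X)

# Largest entrywise difference between the two kernel matrices.
max_diff = np.max(np.abs(rbf_kernel(X) - rbf_kernel(Z)))
print(max_diff)
```

The printed difference should be at the level of floating-point round-off. Because a kernel SVM only sees the data through its kernel matrix, training on `Z` would then pose the same optimization problem as training on `X`. Dropping trailing components breaks this exact equality, and the distances change only by the contribution of the discarded directions.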
Similar resources
Speeding Up Multi-class SVM Evaluation via Principal Component Analysis and Recursive Feature Elimination
Support Vector Machines (SVM) have been shown to yield state-of-the-art performance in many pattern analysis applications. Feature selection methods for SVMs are often used to reduce the complexity of learning and evaluation. In this article we propose to combine a standard method, Recursive Feature Elimination (RFE), with Principal Component Analysis (PCA) to produce a multi-class SVM framewor...
Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine
Different approaches have been proposed for feature selection to obtain a suitable feature subset among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...
Feature selection using genetic algorithm for classification of schizophrenia using fMRI data
In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...
MULTI CLASS BRAIN TUMOR CLASSIFICATION OF MRI IMAGES USING HYBRID STRUCTURE DESCRIPTOR AND FUZZY LOGIC BASED RBF KERNEL SVM
Medical image segmentation partitions an image into a set of regions that are visually distinct and consistent with respect to some property such as gray level, texture, or color. Brain tumor classification is an important and difficult task in cancer radiotherapy. The objective of this research is to examine the use of pattern classification methods for distinguishing different types of...
Dimensionality Reduction for Using High-Order n-Grams in SVM-Based Phonotactic Language Recognition
SVM-based phonotactic language recognition is state-of-the-art technology. However, due to computational bounds, phonotactic information is usually limited to low-order phone n-grams (up to n = 3). In a previous work, we proposed a feature selection algorithm, based on n-gram frequencies, which allowed us to work successfully with high-order n-grams on the NIST 2007 LRE database. In this work, we ...